-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Shenandoah support #10904
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Shenandoah support #10904
Conversation
If you have any questions about how to structure your changes feel free to ping me over slack as I was the primary author of the new LIR support for barriers. |
Thanks, Tom! I will do that whenever I get stuck or have questions. So far I'm making progress. Structurally, Shenandoah will be look like a mix of ZGC (for the load-barrier, even though I am modeling it as a Node that consumes the loaded value, instead of replacing the ReadNode altogether) and G1 (for the SATB parts), and likely Serial/Parallel for the card-table parts. |
Sounds good. There's still some more work to finish out the switch to LIR only barriers but I think supporting G1 and ZGC covers the required strategies in a fairly pragmatic way. |
@tkrodriguez can you please take another look at this. Once done, we can ask @davleopo and @gergo- to look at it. |
I'll take a look. |
Are the gate failures actual problems? It would be good to see a clean gate. Also, we should squash the history before committing. |
I don't know what those problems are. I fixed everything that looked related to my changes. Those failures look like infra problems, some volumes seem to have run out of memory or something. I doubt that it is related.
|
compiler/src/jdk.graal.compiler/src/jdk/graal/compiler/core/phases/LowTier.java
Outdated
Show resolved
Hide resolved
.../src/jdk.graal.compiler/src/jdk/graal/compiler/nodes/gc/shenandoah/ShenandoahBarrierSet.java
Outdated
Show resolved
Hide resolved
@tkrodriguez There seem to be GHA failures that report SerialWriteBarriers not being Lowerable, coming from SubstrateVM. Could this be related to moving the barrier addition phase from mid- to low-tier? I don't think I have changed anything in SerialWriteBarrier or related code. |
f8dac0f
to
9f2c78d
Compare
Yes this is something I mentioned in our slack discussions. Moving The options are to conditionalize the placement of I'm going to try putting |
As far as I can see, it's only the SerialWriteBarrierNode which depends on snippets (is that right?). That should be relatively straightforward to implement without snippets and would look almost exactly like ShenandoahCardBarrierNode implementation - even a little simpler. I could work on implementing that, if you think that'd help. |
It would be easy to convert the HotSpot serial barrier to LIR but native image uses snippets for its barriers and its serial barrier is non-trivial. So we'll need to live with this mixed model for a little while I think. I'm beginning to think we might just need an early and late phase. I think the way barriers for vector writes work, we might have to do barrier addition for them before LowTierLowering, or at least before VectorLoweringPhase, which is before FixReadsPhase. It might be too much to try to resolve all these issues in this PR. Since Shenandoah is currently HotSpot only maybe it would be best to special case its barrier insertion. I'll will play some more with this to see what would be best. |
I implemented barrier addition to trigger in both mid- and low-tier, and the BarrierSet implementation gets to choose which one (or both, if it wishes) is appropriate. This choice defaults to mid-tier, and can be overridden in implementations, like I did in ShenandoahBarrierSet. This way we get a clean gate :-) |
Thanks. I'll see whether that works ok in our full gate. I'd tried something slightly different but the |
Your fix works though I don't love some of the details. I'll just put comments on those places. I was able to get a clean internal gate with just minor changes in enterprise. I wasn't actually able to test Shenandoah because we don't have a labsjdk that includes it at the moment. |
...jdk.graal.compiler.test/src/jdk/graal/compiler/hotspot/test/DeferredBarrierAdditionTest.java
Outdated
Show resolved
Hide resolved
...graal.compiler/src/jdk/graal/compiler/nodes/gc/shenandoah/ShenandoahPreWriteBarrierNode.java
Outdated
Show resolved
Hide resolved
...graal.compiler/src/jdk/graal/compiler/nodes/gc/shenandoah/ShenandoahPreWriteBarrierNode.java
Outdated
Show resolved
Hide resolved
compiler/src/jdk.graal.compiler/src/jdk/graal/compiler/nodes/GraphState.java
Outdated
Show resolved
Hide resolved
...r/src/jdk.graal.compiler/src/jdk/graal/compiler/phases/common/WriteBarrierAdditionPhase.java
Outdated
Show resolved
Hide resolved
5c691ed
to
e9de7db
Compare
compiler/src/jdk.graal.compiler/src/jdk/graal/compiler/hotspot/GraalHotSpotVMConfig.java
Show resolved
Hide resolved
compiler/src/jdk.graal.compiler/src/jdk/graal/compiler/hotspot/HotSpotShenandoahBarrierSet.java
Show resolved
Hide resolved
...raal/compiler/hotspot/aarch64/shenandoah/AArch64HotSpotShenandoahBarrierSetLIRGenerator.java
Show resolved
Hide resolved
.../jdk/graal/compiler/hotspot/aarch64/shenandoah/AArch64HotSpotShenandoahLoadRefBarrierOp.java
Show resolved
Hide resolved
.../jdk/graal/compiler/hotspot/aarch64/shenandoah/AArch64HotSpotShenandoahLoadRefBarrierOp.java
Show resolved
Hide resolved
.../jdk/graal/compiler/hotspot/aarch64/shenandoah/AArch64HotSpotShenandoahLoadRefBarrierOp.java
Show resolved
Hide resolved
.../jdk/graal/compiler/hotspot/aarch64/shenandoah/AArch64HotSpotShenandoahLoadRefBarrierOp.java
Show resolved
Hide resolved
.../jdk/graal/compiler/hotspot/aarch64/shenandoah/AArch64HotSpotShenandoahLoadRefBarrierOp.java
Show resolved
Hide resolved
.../src/jdk/graal/compiler/hotspot/amd64/shenandoah/AMD64HotSpotShenandoahLoadRefBarrierOp.java
Show resolved
Hide resolved
.../src/jdk/graal/compiler/hotspot/amd64/shenandoah/AMD64HotSpotShenandoahLoadRefBarrierOp.java
Show resolved
Hide resolved
.../src/jdk.graal.compiler/src/jdk/graal/compiler/nodes/gc/shenandoah/ShenandoahBarrierSet.java
Show resolved
Hide resolved
public final class ShenandoahLoadRefBarrierNode extends ValueNode implements LIRLowerable { | ||
public static final NodeClass<ShenandoahLoadRefBarrierNode> TYPE = NodeClass.create(ShenandoahLoadRefBarrierNode.class); | ||
|
||
public enum ReferenceStrength { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
javadoc please
Thanks very nice work, looks pretty ready I did not see any blockers from my side. I added some style/unification comments. One question is if we should already add necessary |
I forgot to mention here is the SyncPort class we use it to denote hot spot ports. When the upstream implementation changes we are informed in our internal testing. |
.../jdk/graal/compiler/hotspot/aarch64/shenandoah/AArch64HotSpotShenandoahLoadRefBarrierOp.java
Show resolved
Hide resolved
.../jdk/graal/compiler/hotspot/aarch64/shenandoah/AArch64HotSpotShenandoahLoadRefBarrierOp.java
Show resolved
Hide resolved
I've started doing some gate testing and found one problem.
throws a lot of exceptions like this and never completes
|
and This test might be useful for you to look more closely at. I'd originally written it as a testbed for the ZGC port. It's designed so that you can compare the output of C2 and Graal for all the core barrier patterns. See https://github.com/oracle/graal/blob/master/compiler/src/jdk.graal.compiler.test/src/jdk/graal/compiler/hotspot/test/GCBarrierEmissionTest.java#L53. I have a little utility to split each nmethod dump into it's own file so you can do pairwise comparison if you are interested in that. |
One ugly missing piece is support for the fast truffle dispatch in the truffle implementation of |
This implements the barriers that are needed to run with Shenandoah GC in the Graal compiler. (Issue: #3472)
There are 3 basic kinds of barriers needed for Shenandoah:
Notice that none of the barriers are implemented as snippets (like Serial/Parallel's card-barriers) or in the backend-only (as ZGC's read-barriers). We needed a way to efficiently deal with compressed-oops, which is not (easily) possible to do in the backend. In the node-graph this is pretty easy: insert the LRB with preceding and succeeding uncompress/compress after any load and before the (potential) uncompress (i.e. turn load->uncompress into load -> (uncompress -> lrb -> compress) -> uncompress) and then let the optimizer optimize away the trailing compress -> uncompress pairs.
In order to support this, we needed a few additions:
X86 port contributed by @JohnTortugo.
Testing:
(We have run those workloads for correctness testing only, we have not (yet) conducted a performance study.)